Improving SVF with DISTBIC for Phoneme Segmentation
Authors
Abstract
In this paper we examine the application of DISTBIC, a two-pass, text-independent method traditionally used for speaker segmentation, to phoneme segmentation. The novelty of this paper lies in using the spectral variation function (SVF), a simple non-parametric method for phone segmentation, as a replacement for the distance measure in the first pass of DISTBIC. In doing so we aim to produce a computationally efficient method for text-independent phoneme segmentation that still performs well. Experiments are carried out on the TIMIT database. We compare the performance of the SVF as previously used for segmentation, our DISTBIC-SVF algorithm, and another state-of-the-art algorithm.
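The two-pass idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give their SVF definition, so the code assumes one common cosine-based formulation (one minus the cosine similarity between mean-removed feature averages on either side of a frame), and it assumes the usual full-covariance Gaussian form of the BIC test for the second pass. All function names, the window size `half_width`, the `threshold`, the covariance regularization, and the sign convention (positive score keeps a boundary) are illustrative assumptions.

```python
import numpy as np

def svf(features, half_width=2):
    """Assumed SVF: 1 - cosine similarity of the mean feature vectors
    in the windows just before and just after each frame."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    curve = np.zeros(len(features))
    for t in range(half_width, len(features) - half_width):
        left = centered[t - half_width:t].mean(axis=0)
        right = centered[t:t + half_width].mean(axis=0)
        denom = np.linalg.norm(left) * np.linalg.norm(right)
        if denom > 0:
            curve[t] = 1.0 - np.dot(left, right) / denom
    return curve

def candidate_peaks(curve, threshold):
    """First pass: local maxima of the curve above a threshold."""
    return [t for t in range(1, len(curve) - 1)
            if curve[t] > threshold
            and curve[t] >= curve[t - 1] and curve[t] > curve[t + 1]]

def delta_bic(left, right, penalty_weight=1.0):
    """Gaussian BIC score; positive favours two models (a boundary).
    The sign convention and the 1e-6 regularization are assumptions."""
    both = np.vstack([left, right])
    n, d = both.shape

    def logdet(x):
        cov = np.cov(x, rowvar=False, bias=True) + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    gain = 0.5 * (n * logdet(both)
                  - len(left) * logdet(left)
                  - len(right) * logdet(right))
    penalty = 0.5 * penalty_weight * (d + d * (d + 1) / 2) * np.log(n)
    return gain - penalty

def segment(features, half_width=2, threshold=1.0, penalty_weight=1.0):
    """Second pass: keep only the candidates that the BIC test accepts.
    A rejected candidate is dropped, which merges its two segments."""
    features = np.asarray(features, dtype=float)
    peaks = candidate_peaks(svf(features, half_width), threshold)
    kept, start = [], 0
    for i, peak in enumerate(peaks):
        end = peaks[i + 1] if i + 1 < len(peaks) else len(features)
        if delta_bic(features[start:peak], features[peak:end],
                     penalty_weight) > 0:
            kept.append(peak)
            start = peak
    return kept

# Toy input: 30 frames of one constant "spectrum", then 30 of another.
toy = np.vstack([np.tile([1.0, 0.0], (30, 1)), np.tile([0.0, 1.0], (30, 1))])
print(segment(toy))  # a single boundary at frame 30
```

In a real system the rows of `features` would be per-frame cepstral vectors extracted from speech, and the threshold and penalty weight would need tuning on held-out data.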
Similar resources
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
DISTBIC: A speaker-based segmentation for audio data indexing
In this paper, we address the problem of speaker-based segmentation, which is the first necessary step for several indexing tasks. It aims to extract homogeneous segments containing the longest possible utterances produced by a single speaker. In our context, no assumption is made about prior knowledge of the speaker or speech signal characteristics (neither speaker model, nor speech model). How...
Unsupervised Phoneme Segmentation Using Transformed Cepstrum Features
One of the basic problems in speech engineering is phoneme segmentation, that is, to divide a speech stream into a string of phonemes. Automatic Speech Recognition (ASR) models often require reliable phoneme segmentation in the initial training phase, and Text-to-Speech (TTS) systems need a large speech database with correct phoneme segmentation information for improving the performance. Human ...
Syllable Specific Unit Selection Cost Function Using a Tone Modeling Technique for Automatic Phonetic Segmentation of Hindi Speech Using HMM
This paper presents a technique for improving tone correctness in speech synthesis of a tonal language, based on an average-voice model trained with a corpus of nonprofessional speakers' speech. Unit selection-based concatenative synthesis is one of the most widely used speech synthesis approaches. This approach overcomes the limitations of other synthesis techniques such as articulatory synthesis an...
Wavelet-based speaker change detection in single channel speech data
Speaker segmentation is the task of finding speaker turns in an audio stream. We propose a metric-based algorithm based on Discrete Wavelet Transform (DWT) features. Principal component analysis (PCA) or linear discriminant analysis (LDA) [1] is further used to reduce the dimensionality of the feature space and remove redundant information. In the experiments our methods referred to as DWT-PCA...